OPUS: A Systematic Search Algorithm and Its Application to Categorical Attribute-Value Data-Driven Machine Learning
Abstract
OPUS is a branch and bound search algorithm that enables efficient systematic search through spaces in which the order of search operator application is not significant. OPUS achieves this by maximising the effect of each pruning action. While it is not possible to guarantee in the general case that any pruning will occur, when pruning is possible, its effect is maximised. Experimental application of OPUS in data-driven machine learning demonstrates that NP-hard search problems, for which a solution cannot be guaranteed in reasonable time, can be solved for real-world data within acceptable time frames. Indeed, OPUS is demonstrated to enable systematic search of extremely large search spaces in less time than is taken by common heuristic machine learning search algorithms. The use of systematic search in concept learning enables better experimental comparison of alternative inductive biases than was previously possible, as the precise inductive bias can be described and manipulated. Such comparison reveals that the explicit biases used in some previous concept learning research have interacted with the search heuristics employed to produce implicit biases that frequently outperform the explicit biases. Explicit formulation of these implicit biases would enable their systematic study and refinement, with the potential of improving the predictive accuracy of the hypotheses inferred through machine learning.
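The pruning idea described in the abstract can be illustrated with a small sketch. The Python below is not the published OPUS algorithm; it is a simplified branch and bound over unordered sets of conditions, written under stated assumptions. The function name `unordered_branch_and_bound`, the toy training set, the value function (positives covered minus negatives covered) and the optimistic bound (positives covered) are all hypothetical choices made for illustration. What it tries to show is how a pruned operator can be removed from the active set of an entire subtree, so that one pruning action has as wide an effect as possible.

```python
from typing import Callable, FrozenSet, Hashable, Sequence, Tuple

Op = Hashable
State = FrozenSet[Op]


def unordered_branch_and_bound(
    operators: Sequence[Op],
    value: Callable[[State], float],
    optimistic: Callable[[State, Sequence[Op]], float],
) -> Tuple[State, float]:
    """Search all subsets of `operators` for the one maximising `value`.

    `optimistic(state, active)` is assumed to be an upper bound on the value
    of any set that extends `state` using only operators from `active`.
    """
    best: State = frozenset()
    best_val = value(best)

    def expand(state: State, active: Sequence[Op]) -> None:
        nonlocal best, best_val
        # Score every child first, so the incumbent is as strong as
        # possible before any pruning decision is made.
        for op in active:
            child = state | {op}
            v = value(child)
            if v > best_val:
                best, best_val = child, v
        # If no extension of state + op can beat the incumbent, drop op
        # from the active set of the whole subtree, not just one branch.
        kept = list(active)
        for op in active:
            others = [o for o in kept if o != op]
            if optimistic(state | {op}, others) <= best_val:
                kept = others
        # Each surviving child may only apply operators that come later in
        # the list, so every unordered set is visited at most once.
        for i, op in enumerate(kept):
            expand(state | {op}, kept[i + 1:])

    expand(best, list(operators))
    return best, best_val


# Toy categorical attribute-value training set (illustrative only).
DATA = [
    ({"colour": "red", "shape": "round"}, True),
    ({"colour": "red", "shape": "square"}, True),
    ({"colour": "blue", "shape": "round"}, False),
    ({"colour": "blue", "shape": "square"}, False),
    ({"colour": "red", "shape": "round"}, True),
]


def covers(rule: State, example: dict) -> bool:
    # A rule is a conjunction of (attribute, value) conditions.
    return all(example[a] == v for a, v in rule)


def value(rule: State) -> float:
    pos = sum(1 for x, y in DATA if y and covers(rule, x))
    neg = sum(1 for x, y in DATA if not y and covers(rule, x))
    return pos - neg


def optimistic(rule: State, active: Sequence[Op]) -> float:
    # Adding conditions can only shrink coverage, so the best any
    # specialisation can do is keep every positive and lose every negative.
    return sum(1 for x, y in DATA if y and covers(rule, x))


CONDITIONS = [("colour", "red"), ("colour", "blue"),
              ("shape", "round"), ("shape", "square")]
rule, score = unordered_branch_and_bound(CONDITIONS, value, optimistic)
```

On this toy data every operator is pruned at the root once the single-condition rules have been scored, so the search terminates without visiting any two-condition rule. The fixed operator order used here is a simplification; how operators are ordered and allocated to child nodes is a design choice the sketch does not attempt to reproduce from the paper.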
Related resources
Systematic Search for Categorical Attribute-value Data-driven Machine Learning
Optimal Pruning for Unordered Search is a search algorithm that enables complete search through the space of possible disjuncts at the inner level of a covering algorithm. This algorithm takes as inputs an evaluation function, e, a training set, t, and a set of specialisation operators, o. It outputs a set of operators from o that creates a classifier that maximises e with respect to t. While O...
OPUS: An Efficient Admissible Algorithm for Unordered Search
OPUS is a branch and bound search algorithm that enables efficient admissible search through spaces for which the order of search operator application is not significant. The algorithm’s search efficiency is demonstrated with respect to very large machine learning search spaces. The use of admissible search is of potential value to the machine learning community as it means that the exact learn...
Debt Collection Industry: Machine Learning Approach
Businesses are increasingly interested in how big data, artificial intelligence, machine learning, and predictive analytics can be used to increase revenue, lower costs, and improve their business processes. In this paper, we describe how we have developed a data-driven machine learning method to optimize the collection process for a debt collection agency. Precisely speaking, we create a frame...
Enhancing Learning from Imbalanced Classes via Data Preprocessing: A Data-Driven Application in Metabolomics Data Mining
This paper presents a data mining application in metabolomics. It aims at building an enhanced machine learning classifier that can be used for diagnosing cachexia syndrome and identifying its involved biomarkers. To achieve this goal, a data-driven analysis is carried out using a public dataset consisting of 1H-NMR metabolite profile. This dataset suffers from the problem of imbalanced classes...
Data-Driven Approaches to Improve the Quality of Clinical Processes: A Systematic Review
Background: With the emergence of electronic health records and their related technologies, increasing attention is being paid to data-driven approaches such as machine learning, data mining, and process mining. The aim of this paper was to identify and classify these approaches to enhance the quality of clinical processes. Methods: In order to determine the knowledge related to the research ...
Publication date: 1993